Evaluation and Assessment of Speech Intelligibility on Pathologic Voices Based upon Acoustic Speaker Models
Authors
Abstract
We describe a GMM-UBM-based evaluation system for pathologic voices that uses standard cepstral features. One GMM is created per speaker, and its parameters are concatenated into a so-called GMM supervector. Each speaker's supervector is labeled with the intelligibility scores obtained from human evaluation and is used to train a support vector regression (SVR) model. We studied supervectors built from different GMM parameters. On a database of 85 pathologic speakers, the automatic system reached a correlation of r = 0.83 with the expert listeners when using a 13312-dimensional supervector containing the values of the diagonal covariance matrices of 26-dimensional Gaussians.
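The supervector construction and the correlation measure mentioned in the abstract can be illustrated with a minimal sketch. It assumes 512 mixture components, a figure inferred only from 13312 / 26 = 512 and not stated in the abstract. The function names `covariance_supervector` and `pearson_r` are hypothetical, the covariance values are toy placeholders, and the GMM training and SVR steps are left out entirely.

```python
from math import sqrt


def covariance_supervector(diag_covariances):
    """Flatten per-component diagonal covariances into one flat vector."""
    return [value for component in diag_covariances for value in component]


def pearson_r(xs, ys):
    """Pearson correlation between two equally long score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Assumption: 512 components, inferred from 13312 / 26; not given in the text.
N_COMPONENTS = 512
N_DIMS = 26  # dimensionality of each Gaussian, as stated in the abstract

# Toy placeholder values standing in for one speaker's adapted GMM.
toy_covariances = [[1.0] * N_DIMS for _ in range(N_COMPONENTS)]
supervector = covariance_supervector(toy_covariances)
print(len(supervector))
```

In a full system, one such vector per speaker would serve as the regression input, and `pearson_r` would compare the predicted scores with the expert ratings.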
Similar resources
Synthesis using Speaker Adaptation from Speech Recognition DB
This paper deals with the creation of multiple voices from a Hidden Markov Model based speech synthesis system (HTS). More than 150 Catalan synthetic voices were built using Hidden Markov Models (HMM) and speaker adaptation techniques. Training data for building a Speaker-Independent (SI) model were selected from both a general purpose speech synthesis database (FestCat) and a database designe...
Utterance Selection for Optimizing Intelligibility of TTS Voices Trained on ASR Data
This paper describes experiments in training HMM-based text-to-speech (TTS) voices on data collected for Automatic Speech Recognition (ASR) training. We compare a number of filtering techniques designed to identify the best utterances from a noisy, multi-speaker corpus for training voices, to exclude speech containing noise and to include speech close in nature to more traditionally-collected T...
Language-independent automatic evaluation of intelligibility of chronically hoarse persons.
OBJECTIVE: Automatic intelligibility assessment using automatic speech recognition is usually language specific. In this study, a language-independent approach is proposed. It uses models that are trained with Flemish speech, and it is applied to assess chronically hoarse German speakers. The research questions are here: is it possible to construct suitable acoustic features that generalize to o...
Exploration of acoustic correlates in speaker selection for concatenative synthesis
It is often difficult to determine the suitability of a speaker to serve as a model for concatenative text-to-speech synthesis. The perceived quality of a speaker's natural voice is not necessarily predictive of its (even relative) synthetic quality. The selection of female and male speakers on whom to base two synthetic voices for the new AT&T text-to-speech system was made empirically. Brief re...
Automated Intelligibility Assessment of Pathological Speech Using Phonological Features
It is commonly acknowledged that word or phoneme intelligibility is an important criterion in the assessment of the communication efficiency of a pathological speaker. People have therefore put a lot of effort in the design of perceptual intelligibility rating tests. These tests usually have the drawback that they employ unnatural speech material (e.g., nonsense words) and that they cannot full...